Automatic Question Categorization: a New Approach for Text Elaboration

Authors

  • Marcelo Adriano Amâncio
  • Magali Sanches Duran
  • Sandra M. Aluísio
Abstract

Text adaptation is a common activity among teachers to facilitate reading comprehension of specific content; the general approaches to it are Text Simplification and Text Elaboration (TE). TE aims at clarifying and explaining information and at making connections explicit in texts. In this paper, we present a new approach for TE: an automatic question categorization system which assigns wh-question labels to verbal arguments in a sentence. For example, in "Mary danced yesterday.", "Who?" is the label linking the verb "danced" to the argument "Mary", and "When?" links "danced" to the argument "yesterday". This annotation is similar to semantic role labeling, which has been approached successfully via statistical language processing techniques. Specifically, we present experiments to build the system using a fine-grained question set for the Portuguese language and address two key research questions: (1) Which machine-learning algorithm presents the best results? (2) What problems does this task present, and how can they be overcome?
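
As a rough illustration of the annotation described in the abstract, the sketch below represents the example sentence as a verb whose arguments carry wh-question labels. The `LabeledArgument` class, the `EXAMPLE_LABELS` table, and the `label_arguments` helper are hypothetical names introduced here. The labels are hard-coded from the abstract's example; this is not the authors' machine-learning system, which would predict such labels rather than look them up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledArgument:
    """One verb-argument link annotated with a wh-question label."""
    verb: str
    argument: str
    question: str


# Hand-written labels copied from the abstract's example sentence
# ("Mary danced yesterday."). Assumption: a real system would predict
# these with a trained classifier instead of a lookup table.
EXAMPLE_LABELS = {
    ("danced", "Mary"): "Who?",
    ("danced", "yesterday"): "When?",
}


def label_arguments(verb, arguments):
    """Return the wh-question label linking `verb` to each argument."""
    return [
        LabeledArgument(verb, arg, EXAMPLE_LABELS[(verb, arg)])
        for arg in arguments
    ]


labels = label_arguments("danced", ["Mary", "yesterday"])
for item in labels:
    print(f"{item.question} links '{item.verb}' to '{item.argument}'")
```

The lookup table only fixes the shape of the output; the classification step that the paper evaluates would replace it.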

Similar resources

Automatic Question Categorization: a New Approach for Text Elaboration Categorización automática de preguntas: un nuevo enfoque para elaboración de textos

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

Automatic Text Categorization and Its Application to Text

We develop an automatic text categorization approach and investigate its application to text retrieval. The categorization approach is derived from a combination of a learning paradigm known as instance-based learning and an advanced document retrieval technique known as retrieval feedback. We demonstrate the effectiveness of our categorization approach using two real-world document collections f...

Automatic Text Categorization and Its Application to Text Retrieval

We develop an automatic text categorization approach and investigate its application to text retrieval. The categorization approach is derived from a combination of a learning paradigm known as instance-based learning and an advanced document retrieval technique known as retrieval feedback. We demonstrate the effectiveness of our categorization approach using two real-world document collections...

BoosTexter: A Boosting-based System for Text Categorization

This work focuses on algorithms which learn from examples to perform multiclass text and speech categorization tasks. Our approach is based on a new and improved family of boosting algorithms. We describe in detail an implementation, called BoosTexter, of the new boosting algorithms for text categorization tasks. We present results comparing the performance of BoosTexter and a number of other t...

Journal:
  • Procesamiento del Lenguaje Natural

Volume 46

Publication date: 2011